Deep unfolding networks (DUNs) have proven to be a viable approach to compressive sensing (CS). In this work, we propose a DUN called low-rank CS network (LR-CSNet) for natural image CS. Real-world image patches are often well-represented by low-rank approximations. LR-CSNet exploits this property by adding a low-rank prior to the CS optimization task. We derive a corresponding iterative optimization procedure using variable splitting, which is then translated to a new DUN architecture. The architecture uses low-rank generation modules (LRGMs), which learn low-rank matrix factorizations, as well as gradient descent and proximal mappings (GDPMs), which are proposed to extract high-frequency features to refine image details. In addition, the deep features generated at each reconstruction stage in the DUN are transferred between stages to boost the performance. Our extensive experiments on three widely considered datasets demonstrate the promising performance of LR-CSNet compared to state-of-the-art methods in natural image CS.
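The low-rank property that LR-CSNet relies on can be illustrated with a classical, non-learned computation. The sketch below is not the paper's LRGM; it is a plain power iteration that finds the best rank-1 approximation of a small patch. The function name `rank1_approx`, the iteration count, and the all-ones starting vector are assumptions made for illustration only.

```python
from math import sqrt


def rank1_approx(patch, iters=50):
    """Approximate a matrix (list of rows) by a rank-1 outer product u * v^T.

    Uses power iteration on the patch; assumes the all-ones start vector is
    not orthogonal to the dominant right singular vector.
    """
    rows, cols = len(patch), len(patch[0])
    v = [1.0] * cols
    u = [0.0] * rows
    for _ in range(iters):
        # u <- normalized (patch @ v)
        u = [sum(patch[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        norm = sqrt(sum(x * x for x in u))
        if norm == 0.0:
            # Degenerate start (e.g. an all-zero patch): return a zero matrix.
            return [[0.0] * cols for _ in range(rows)]
        u = [x / norm for x in u]
        # v <- patch^T @ u, which carries the singular value as its length
        v = [sum(patch[i][j] * u[i] for i in range(rows)) for j in range(cols)]
    return [[u[i] * v[j] for j in range(cols)] for i in range(rows)]


def frobenius_error(a, b):
    """Frobenius norm of the difference between two equally sized matrices."""
    return sqrt(sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)))


if __name__ == "__main__":
    nearly_rank1 = [[1.0, 2.0], [2.0, 4.1]]
    approx = rank1_approx(nearly_rank1)
    print(frobenius_error(nearly_rank1, approx))  # small residual
```

A patch whose residual stays small under such an approximation is one for which a low-rank prior is informative; a learned factorization module would replace this fixed procedure.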
translated by Google Translate
Over-parameterized deep neural networks (DNNs) have shown high prediction accuracy in many applications. Although effective, their large number of parameters hinders their use on resource-limited devices and has an outsized environmental impact. Sparse training (using a fixed number of nonzero weights in each iteration) could significantly reduce training costs by shrinking the model size. However, existing sparse training methods mainly use either random or greedy drop-and-grow strategies, which leads to local minima and low accuracy. In this work, to assist explainable sparse training, we propose important-weights Exploitation and coverage Exploration to characterize Dynamic Sparse Training (DST-EE), and provide a quantitative analysis of these two metrics. We further design an acquisition function, provide theoretical guarantees for the proposed method, and clarify its convergence property. Experimental results show that sparse models (up to 98% sparsity) obtained by our proposed method outperform the SOTA sparse training methods on a wide variety of deep learning tasks. On VGG-19 / CIFAR-100, ResNet-50 / CIFAR-10, and ResNet-50 / CIFAR-100, our method achieves even higher accuracy than dense models. On ResNet-50 / ImageNet, the proposed method improves accuracy by up to 8.2% compared to SOTA sparse training methods.
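A drop-and-grow update that balances exploitation and exploration can be sketched as follows. This is not the DST-EE acquisition function, which the abstract does not specify; the score `|gradient| + c / (1 + times_active)` and every name below (`drop_and_grow`, `times_active`, `c`) are hypothetical stand-ins chosen only to show the mechanism while keeping the number of nonzero weights fixed.

```python
def drop_and_grow(weights, mask, grads, times_active, k, c=1.0):
    """One illustrative drop-and-grow step on a flat list of weights.

    Drops the k active weights with the smallest magnitude, then grows k
    previously inactive weights with the highest hypothetical score:
        |gradient| (exploitation) + c / (1 + times_active) (exploration).
    Returns new weight and mask lists; the inputs are left unmodified.
    """
    active = [i for i, m in enumerate(mask) if m]
    inactive = [i for i, m in enumerate(mask) if not m]
    k = min(k, len(active), len(inactive))

    # Drop: smallest-magnitude active weights.
    dropped = sorted(active, key=lambda i: abs(weights[i]))[:k]

    # Grow: candidates are weights that were inactive before this step,
    # so a weight dropped now cannot be regrown immediately.
    def score(i):
        return abs(grads[i]) + c / (1 + times_active[i])

    grown = sorted(inactive, key=score, reverse=True)[:k]

    new_weights, new_mask = list(weights), list(mask)
    for i in dropped:
        new_mask[i] = 0
        new_weights[i] = 0.0
    for i in grown:
        new_mask[i] = 1
        new_weights[i] = 0.0  # newly grown weights start from zero
    return new_weights, new_mask


if __name__ == "__main__":
    weights = [0.9, 0.01, 0.0, 0.0, -0.5, 0.0]
    mask = [1, 1, 0, 0, 1, 0]
    grads = [0.1, 0.1, 0.3, 0.2, 0.1, 0.25]
    times_active = [5, 5, 4, 0, 5, 0]
    print(drop_and_grow(weights, mask, grads, times_active, k=1, c=1.0)[1])
    print(drop_and_grow(weights, mask, grads, times_active, k=1, c=0.0)[1])
```

With `c=0.0` the step is purely greedy and regrows the weight with the largest gradient; with a positive `c` it prefers a weight that has rarely been active, which is the coverage idea in miniature.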
Conceptual knowledge is fundamental to human cognition and knowledge bases. However, existing knowledge probing work focuses only on evaluating the factual knowledge of pre-trained language models (PLMs) and ignores conceptual knowledge. Since conceptual knowledge often appears as implicit commonsense behind texts, designing probes for it is hard. Inspired by knowledge representation schemata, we comprehensively evaluate the conceptual knowledge of PLMs by designing three tasks that probe whether PLMs organize entities by conceptual similarities, learn conceptual properties, and conceptualize entities in contexts, respectively. For these tasks, we collect and annotate 24k data instances covering 393 concepts, forming COPEN, a COnceptual knowledge Probing bENchmark. Extensive experiments on PLMs of different sizes and types show that existing PLMs systematically lack conceptual knowledge and suffer from various spurious correlations. We believe this is a critical bottleneck for realizing human-like cognition in PLMs. COPEN and our code are publicly released at
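Multimodal fluorodeoxyglucose (FDG) positron emission tomography / computed tomography (PET/CT) is routinely used to assess common cancers such as lung cancer, lymphoma, and melanoma. This is mainly because PET/CT combines the high sensitivity of PET for tumor detection with the anatomical information from CT. Automatic tumor segmentation is an important step in PET/CT image assessment, and in recent years deep learning based methods have become the state of the art. Unfortunately, existing methods tend to over-segment tumor regions and include regions such as normal high-uptake organs, inflammation, and other infections. In this study, we introduce a false positive reduction network to overcome this limitation. We first introduce a self-supervised pre-trained global segmentation module that coarsely delineates candidate tumor regions using a self-supervised pre-trained encoder. The candidate tumor regions are then refined by removing false positives with a local refinement module. Our experiments on the MICCAI 2022 Automated Lesion Segmentation in Whole-Body FDG-PET/CT (AutoPET) challenge dataset show that our method achieved a Dice score of 0.9324 on the preliminary test data and ranked first on the leaderboard. Our method also ranked among the top 7 methods on the final test data; the final ranking will be announced during the 2022 MICCAI AutoPET workshop. Our code is available at: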
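The Dice score reported above is a standard overlap metric, 2|P ∩ T| / (|P| + |T|), and a small example shows why false positives lower it. The sketch below assumes binary masks flattened into plain Python lists; the function name `dice_score` and the convention that two empty masks score 1.0 are assumptions, not details taken from the challenge's evaluation code.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as flat sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    # Voxels marked as tumor in both the prediction and the ground truth.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


if __name__ == "__main__":
    target = [1, 1, 0, 0, 0, 0]
    over_segmented = [1, 1, 1, 0, 0, 1]  # two false-positive voxels
    refined = [1, 1, 0, 0, 0, 0]         # false positives removed
    print(dice_score(over_segmented, target))
    print(dice_score(refined, target))
```

In this toy case, removing the two false-positive voxels raises the score from 2/3 to 1.0, which is the effect a false positive reduction stage aims for.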
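This paper reviews the Challenge on Super-Resolution of Compressed Image and Video at AIM 2022. The challenge includes two tracks. Track 1 targets the super-resolution of compressed images, and Track 2 targets the super-resolution of compressed videos. In Track 1, we use the popular DIV2K dataset as the training, validation, and test sets. In Track 2, we propose the LDV 3.0 dataset, which contains 365 videos, comprising the LDV 2.0 dataset (335 videos) and 30 additional videos. In this challenge, 12 teams and 2 teams submitted final results to Track 1 and Track 2, respectively. The proposed methods and solutions gauge the state of the art in super-resolution of compressed images and videos. The proposed LDV 3.0 dataset is available at . The homepage of this challenge is at .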
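High-quality feature representation is key to instance image retrieval. To attain it, existing methods usually resort to a deep model pre-trained on benchmark datasets, or fine-tune the model with a task-dependent labelled auxiliary dataset. Although these methods achieve promising results, they are limited by two issues: 1) the domain gap between the benchmark datasets and the dataset of a given retrieval task; 2) the required auxiliary dataset cannot be readily obtained. In light of this situation, this work looks into a different approach that has not been well investigated for instance image retrieval: can we learn feature representations specific to a given retrieval task in order to achieve excellent retrieval? Our findings are encouraging. By adding an object proposal generator to produce image regions for self-supervised learning, the investigated approach can successfully learn feature representations specific to a given dataset for retrieval. This representation can be made even more effective by boosting it with image similarity information mined from the dataset. As experimentally validated, this simple "self-supervised learning + self-boosting" approach competes well with the relevant state-of-the-art retrieval methods. An ablation study is conducted to show the appeal of this approach and its limitation in generalising across datasets.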
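In this technical report, we briefly introduce the solution of our team "PKU-WICT-MIPL" for the PIC Makeup Temporal Video Grounding (MTVG) Challenge at ACM-MM 2022. Given an untrimmed makeup video and a step query, MTVG aims to localize the temporal moment of the target makeup step in the video. To tackle this task, we propose a phrase relationship mining framework that exploits the temporal localization relationships relevant to both fine-grained phrases and the whole sentence. In addition, we propose to constrain the localization results of different step sentence queries so that they do not overlap with each other, using a dynamic programming algorithm. Experimental results demonstrate the effectiveness of our method. Our final submission ranked second on the leaderboard, with a gap of only 0.55% from first place.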
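The non-overlap constraint can be illustrated with a small dynamic program. The report does not give its exact formulation, so the sketch below makes its own assumptions: each step query comes with a list of scored candidate segments, the steps occur in the video in query order, and segments that merely touch at an endpoint count as non-overlapping. The function name `select_non_overlapping` and the `(start, end, score)` tuple layout are hypothetical.

```python
def select_non_overlapping(candidates):
    """Pick one (start, end) segment per query, maximizing the total score.

    candidates[q] is a list of (start, end, score) tuples for the q-th step
    query. Chosen segments must follow query order and must not overlap.
    Returns a list of (start, end) pairs, or None if no valid choice exists.
    """
    neg_inf = float("-inf")
    best = []  # best[q][j]: best total score ending with candidate j of query q
    back = []  # back[q][j]: index of the candidate chosen for query q - 1
    for q, cands in enumerate(candidates):
        row, ptr = [], []
        for start, _end, score in cands:
            if q == 0:
                row.append(score)
                ptr.append(-1)
                continue
            prev_best, prev_idx = neg_inf, -1
            for i, (_s, prev_end, _sc) in enumerate(candidates[q - 1]):
                # A predecessor is valid only if it ends before this one starts.
                if prev_end <= start and best[q - 1][i] > prev_best:
                    prev_best, prev_idx = best[q - 1][i], i
            row.append(prev_best + score if prev_idx >= 0 else neg_inf)
            ptr.append(prev_idx)
        best.append(row)
        back.append(ptr)

    if not best or not best[-1] or max(best[-1]) == neg_inf:
        return None

    # Backtrack from the best final candidate to recover the chosen segments.
    j = max(range(len(best[-1])), key=lambda idx: best[-1][idx])
    chosen = []
    for q in range(len(candidates) - 1, -1, -1):
        chosen.append(candidates[q][j][:2])
        j = back[q][j]
    return chosen[::-1]


if __name__ == "__main__":
    candidates = [
        [(0.0, 10.0, 0.9), (5.0, 12.0, 0.6)],
        [(8.0, 15.0, 0.8), (10.0, 18.0, 0.7)],
    ]
    print(select_non_overlapping(candidates))
```

In the example, taking the top-scoring candidate for each query independently would select two overlapping segments; the dynamic program instead returns the best-scoring combination that respects the constraint.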